This is an R Markdown Notebook
Your project - Detect Anomaly in Industrial Process with Deep Learning
Deep Learning... In this lecture I would like to explore with you the deep learning we can do with the H2O package in R.
Data Exploration
# libraries
library(tidyverse)
library(plotly)
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
About the NORMAL dataset
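# read the NORMAL dataset and keep columns 2 to 11 as a numeric matrix
DF_NORMAL <- readr::read_rds("DATA-normal.rds")
NORM_2016 <- DF_NORMAL %>% select(2:11) %>% as.matrix()
summary(NORM_2016)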
This dataset was selected .. after .. the period of deep process maintenance
plot_ly(z = NORM_2016, type = "surface")
About the TEST dataset
This dataset contains some values that may indicate a potential anomaly.
# read the TEST dataset and keep the same columns
DF_TEST <- readr::read_rds("DATA-test.rds")
TEST_2015 <- DF_TEST %>% select(2:11) %>% as.matrix()
plot_ly(z = TEST_2015, type = "surface")
About the ANOMALY dataset
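This dataset contains some values that indicate an anomaly.
# read the ANOMALY dataset and keep the same columns
DF_ANOMALY <- readr::read_rds("DATA-anomaly.rds")
TEST_2017 <- DF_ANOMALY %>% select(2:11) %>% as.matrix()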
plot_ly(z = TEST_2017, type = "surface")
Summary: three datasets were selected.
Train Deep Learning Model
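library(h2o)
# start a local H2O cluster
h2o.init(nthreads = 2)

# convert the NORMAL matrix to an H2O frame for training
train <- as.h2o(x = NORM_2016, destination_frame = "train")

# ?h2o.deeplearning
# autoencoder with a 5-unit bottleneck, trained on normal data only
normality_model <- h2o.deeplearning(x = names(train),
                                    model_id = "DeepLearning_id20180317",
                                    training_frame = train,
                                    activation = "Tanh",
                                    autoencoder = TRUE,
                                    hidden = c(8, 5, 8),
                                    sparse = TRUE,
                                    l1 = 1e-4,
                                    epochs = 100)

# save this model
h2o.saveModel(normality_model, getwd())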
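The idea behind training an autoencoder on normal data can be illustrated without H2O. The sketch below is only an analogy: it uses a 5-component PCA as a linear stand-in for the 8-5-8 network, and the data are synthetic, not the course data. The helper names `reconstruct` and `row_mse` are hypothetical. It shows that rows which follow the learned structure reconstruct with a small per-row MSE, while rows that break it reconstruct poorly.

```r
set.seed(42)

# Synthetic stand-in for a 10-column process matrix (NOT the course data):
# 3 hidden drivers mixed into 10 correlated signals plus small noise
n <- 200
latent <- matrix(rnorm(n * 3), ncol = 3)
mixing <- matrix(rnorm(3 * 10), nrow = 3)
X <- latent %*% mixing + matrix(rnorm(n * 10, sd = 0.1), ncol = 10)

# Linear stand-in for the autoencoder: PCA fitted on the "normal" rows only
pca <- prcomp(X, center = TRUE, scale. = FALSE)
k <- 5  # bottleneck size, mirroring the middle hidden layer

# Encode to k components, then decode back to the original 10 columns
reconstruct <- function(newdata, pca, k) {
  rot <- pca$rotation[, 1:k, drop = FALSE]
  centered <- sweep(newdata, 2, pca$center)
  sweep(centered %*% rot %*% t(rot), 2, pca$center, "+")
}

# Per-row mean squared reconstruction error
row_mse <- function(newdata, pca, k) {
  rowMeans((newdata - reconstruct(newdata, pca, k))^2)
}

mse_normal <- row_mse(X, pca, k)

# Rows that ignore the learned correlation structure
X_odd <- matrix(rnorm(20 * 10, sd = 3), ncol = 10)
mse_odd <- row_mse(X_odd, pca, k)

print(c(normal = mean(mse_normal), odd = mean(mse_odd)))
```

The H2O model is nonlinear and regularized, so its numbers will differ, but the way the error is read is the same: a high reconstruction error means the row does not look like the training data.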
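Check MSE
# convert the TEST and ANOMALY matrices to H2O frames
test <- as.h2o(x = TEST_2015, destination_frame = "test")
anomaly <- as.h2o(x = TEST_2017, destination_frame = "anomaly")

# per-row reconstruction error (MSE) of each dataset under the trained model
mse_norm <- h2o.anomaly(normality_model, train) %>% as.data.frame()
mse_test <- h2o.anomaly(normality_model, test) %>% as.data.frame()
mse_anom <- h2o.anomaly(normality_model, anomaly) %>% as.data.frame()

# label each dataset, stack them and plot the errors in sequence
mse_norm$label <- "normal"
mse_test$label <- "test"
mse_anom$label <- "anomaly"
mse_all <- rbind(mse_norm, mse_test, mse_anom)
mse_all$index <- 1:nrow(mse_all)
ggplot(mse_all, aes(x = index, y = Reconstruction.MSE, col = as.factor(label))) + geom_line()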
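One possible way to turn reconstruction errors into an anomaly decision is to take a high quantile of the errors on normal data as a threshold. This is a minimal sketch under assumptions: the error vectors are simulated here (in the notebook they would be the Reconstruction.MSE column returned by h2o.anomaly), and the 0.99 quantile is an arbitrary illustrative choice, not a value from the lecture.

```r
set.seed(1)

# Hypothetical reconstruction errors (simulated, NOT model output)
err_normal <- rexp(500, rate = 100)            # errors on normal data
err_new <- c(rexp(95, rate = 100),             # mostly normal-looking rows
             runif(5, min = 0.2, max = 0.5))   # 5 injected high-error rows

# Threshold: 99th percentile of the errors seen on normal data (assumed choice)
threshold <- unname(quantile(err_normal, probs = 0.99))

# Indices of new rows whose error exceeds the threshold
flagged <- which(err_new > threshold)

print(threshold)
print(flagged)
```

A quantile threshold fixes the expected false alarm rate on normal data at roughly one minus the chosen probability, which is a trade-off to tune for the process at hand.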
# shutdown JVM
h2o.shutdown(prompt = FALSE)
[1] TRUE
Homework:
- Try different activation functions and calculate the MSE each time.
- Try sparse = FALSE. What changes?
- Try models of high and very high complexity, calculate the MSE, and conclude which one is best.
- Try 10:100:10. Does it work?
- Try 200:200. Will it work? What is the resulting MSE?
- Try 5 hidden layers.
- Try replicate_training_data = FALSE. How much does the training time differ?
- weights_column is the observation weight. Try adding one column with observation weights of importance (see the help).
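The first homework items can be organized as a small comparison loop. The sketch below is one way to do it, not the lecture's solution: `configs`, `mean_recon_mse` and `compare_configs` are hypothetical names, reading "10:100:10" and "200:200" as hidden layer sizes is an assumption, and the helpers only work with a running H2O cluster and an H2OFrame such as `train`.

```r
# Candidate settings; the hidden sizes are one reading of the homework notation
configs <- list(
  list(activation = "Tanh",      hidden = c(8, 5, 8)),
  list(activation = "Rectifier", hidden = c(8, 5, 8)),
  list(activation = "Tanh",      hidden = c(10, 100, 10)),
  list(activation = "Tanh",      hidden = c(200, 200))
)

# Hypothetical helper: train one autoencoder and return its mean
# reconstruction MSE on `frame` (needs h2o.init() to have been called)
mean_recon_mse <- function(frame, activation, hidden, epochs = 100) {
  model <- h2o::h2o.deeplearning(x = names(frame),
                                 training_frame = frame,
                                 activation = activation,
                                 autoencoder = TRUE,
                                 hidden = hidden,
                                 sparse = TRUE,
                                 l1 = 1e-4,
                                 epochs = epochs)
  errors <- as.data.frame(h2o::h2o.anomaly(model, frame))
  mean(errors$Reconstruction.MSE)
}

# Hypothetical helper: score every configuration and collect the results
compare_configs <- function(frame, configs) {
  data.frame(
    activation = vapply(configs, function(cfg) cfg$activation, character(1)),
    hidden = vapply(configs, function(cfg) paste(cfg$hidden, collapse = ":"),
                    character(1)),
    mse = vapply(configs,
                 function(cfg) mean_recon_mse(frame, cfg$activation, cfg$hidden),
                 numeric(1))
  )
}

# Usage, with a running cluster and the training frame from the notebook:
# results <- compare_configs(train, configs)
# results[order(results$mse), ]
```

A lower MSE on the training frame alone does not make a model better for anomaly detection, since a very large network can also reconstruct anomalous rows well. Comparing the errors on the TEST and ANOMALY frames as well gives a fairer picture.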
Ways to productionize the model
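library(h2o)
# loading a model requires a running H2O cluster
h2o.init(nthreads = 2)
# load the model from the working directory
loaded_model <- h2o.loadModel(file.path(getwd(), "DeepLearning_id20180317"))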
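Once the model is loaded, new observations can be scored the same way as during training. The helper below is a hypothetical sketch, not part of the lecture: `flag_anomalies` is an invented name, `new_data` is assumed to have the same 10 columns as the training matrix, and `threshold` has to be chosen beforehand from the errors on normal data.

```r
# Hypothetical helper: score new rows with a loaded autoencoder and flag
# those whose reconstruction error exceeds a pre-chosen threshold.
# Needs a running H2O cluster; `new_data` is a matrix or data frame.
flag_anomalies <- function(model, new_data, threshold) {
  frame <- h2o::as.h2o(new_data)
  errors <- as.data.frame(h2o::h2o.anomaly(model, frame))$Reconstruction.MSE
  data.frame(row = seq_along(errors),
             mse = errors,
             is_anomaly = errors > threshold)
}

# Usage, with the objects from the notebook and an assumed threshold value:
# scored <- flag_anomalies(loaded_model, TEST_2017, threshold = 0.05)
# subset(scored, is_anomaly)
```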